Towards precision medicine: discovering novel gynecological cancer biomarkers and pathways using linked data

Authors

  • Alokkumar Jha
  • Yasar Khan
  • Muntazir Mehdi
  • Md. Rezaul Karim
  • Qaiser Mehmood
  • Achille Zappa
  • Dietrich Rebholz-Schuhmann
  • Ratnesh Sahay
Abstract

BACKGROUND Next Generation Sequencing (NGS) is playing a key role in therapeutic decision making for cancer prognosis and treatment. NGS technologies are producing a massive amount of sequencing data. Often, these datasets are published by separate, isolated sequencing facilities. Consequently, the process of sharing and aggregating multisite sequencing datasets is thwarted by issues such as the need to discover relevant data from different sources, to build scalable repositories, to automate data linkage, to handle the volume of the data, to query it efficiently, and to provide information-rich, intuitive visualisation.

RESULTS We present an approach to link and query different sequencing datasets (TCGA, COSMIC, REACTOME, KEGG and GO) to indicate risks for four cancer types - Ovarian Serous Cystadenocarcinoma (OV), Uterine Corpus Endometrial Carcinoma (UCEC), Uterine Carcinosarcoma (UCS), Cervical Squamous Cell Carcinoma and Endocervical Adenocarcinoma (CESC) - covering the 16 healthy tissue-specific genes from Illumina Human Body Map 2.0. The differentially expressed genes from Illumina Human Body Map 2.0 are analysed together with the gene expressions reported in the COSMIC and TCGA repositories, leading to the discovery of potential biomarkers for tissue-specific cancers.

CONCLUSION We analyse the tissue expression of genes, copy number variation (CNV), somatic mutation, and promoter methylation to identify associated pathways and find novel biomarkers. We discovered twenty (20) mutated genes and three (3) potential pathways causing promoter changes in different gynaecological cancer types. We propose a data-interlinked platform called BIOOPENER that glues together heterogeneous cancer and biomedical repositories. The key approach is to find correspondences (or data links) among genetic, cellular and molecular features across isolated cancer datasets, giving insight into cancer progression from normal to diseased tissues.
The proposed BIOOPENER platform enriches mutations by filling in missing links from the TCGA, COSMIC, REACTOME, KEGG and GO datasets and provides an interlinking mechanism to understand cancer progression from normal to diseased tissues with pathway components, which in turn helps to map mutations, associated phenotypes, pathways, and mechanisms.
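
The idea of finding data links across isolated datasets can be illustrated with a minimal sketch. This is not the BIOOPENER implementation, which the abstract does not describe in detail; it only shows, under simplifying assumptions, how records from several sources might be grouped by a shared gene identifier. The source names come from the abstract, while the record layout, the gene symbols (GENE_A, GENE_B, GENE_C), the feature labels, and both function names are hypothetical placeholders.

```python
from collections import defaultdict

# Hypothetical toy records. The source names mirror the abstract, but every
# gene symbol and feature value is a made-up placeholder, not data from the paper.
RECORDS = [
    {"source": "TCGA", "gene": "GENE_A", "feature": "somatic_mutation"},
    {"source": "COSMIC", "gene": "GENE_A", "feature": "cnv_gain"},
    {"source": "KEGG", "gene": "GENE_A", "feature": "pathway:PATHWAY_1"},
    {"source": "TCGA", "gene": "GENE_B", "feature": "promoter_methylation"},
    {"source": "GO", "gene": "GENE_C", "feature": "term:TERM_1"},
]


def link_by_gene(records):
    """Group records from all sources under their shared gene symbol."""
    linked = defaultdict(list)
    for rec in records:
        linked[rec["gene"]].append((rec["source"], rec["feature"]))
    return dict(linked)


def genes_in_multiple_sources(linked, min_sources=2):
    """Return, in sorted order, genes seen in at least `min_sources` distinct sources."""
    return sorted(
        gene
        for gene, entries in linked.items()
        if len({source for source, _ in entries}) >= min_sources
    )


linked = link_by_gene(RECORDS)
candidates = genes_in_multiple_sources(linked)
```

In this toy setting, a gene that is linked across several sources is simply one whose records can be read side by side; a real linked-data platform would more plausibly express such links as RDF triples and query them, which this sketch does not attempt.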

Similar resources

Human Cancer Modeling: Recapitulating Tumor Heterogeneity Towards Personalized Medicine

Despite diagnostic, preventive and therapeutic advances, growing incidence of cancer and high rate of mortality among patients affected by specific cancer types indicate current clinical measures are not ideally useful in eradicating cancer. Chemoresistance and subsequent disease relapse are believed to be mainly driven by the cell-molecular heterogeneity of human tumors that necessitates perso...

Classification and Biomarker Genes Selection for Cancer Gene Expression Data Using Random Forest

Background & objective: Microarray and next generation sequencing (NGS) data are the important sources to find helpful molecular patterns. Also, the great number of gene expression data increases the challenge of how to identify the biomarkers associated with cancer. The random forest (RF) is used to effectively analyze the problems of large-p and smal...

Scenario and future prospects of microRNAs in gastric cancer: A review

Carcinoma of the stomach is one of the major prevalent and principal causes of cancer-related deaths worldwide. Current advancement in technology has improved the understanding of the pathogenesis and pathology of gastric cancers (GC). But, high mortality rates, unfavorable prognosis and lack of clinical predictive biomarkers provide an impetus to investigate novel early diagnostic/prognostic m...

Constructing Tumor Progression Pathways and Biomarker Discovery with Fuzzy Kernel Kmeans and DNA Methylation Data

Constructing pathways of tumor progression and discovering the biomarkers associated with cancer is critical for understanding the molecular basis of the disease and for the establishment of novel chemotherapeutic approaches and in turn improving the clinical efficiency of the drugs. It has recently received a lot of attention from bioinformatics researchers. However, relatively few methods are...

Journal title:

Volume 8  Issue 

Pages  -

Publication date 2017